This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.
Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter. # First part ## data filter and normalization
source("./tianfengRwrappers.R")
载入需要的程辑包:dplyr
载入程辑包:‘dplyr’
The following object is masked from ‘package:matrixStats’:
count
The following object is masked from ‘package:Biobase’:
combine
The following objects are masked from ‘package:GenomicRanges’:
intersect, setdiff, union
The following object is masked from ‘package:GenomeInfoDb’:
intersect
The following objects are masked from ‘package:IRanges’:
collapse, desc, intersect, setdiff, slice, union
The following objects are masked from ‘package:S4Vectors’:
first, intersect, rename, setdiff, setequal, union
The following objects are masked from ‘package:BiocGenerics’:
combine, intersect, setdiff, union
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
载入需要的程辑包:reticulate
载入需要的程辑包:tidyr
载入程辑包:‘tidyr’
The following object is masked from ‘package:S4Vectors’:
expand
载入程辑包:‘MySeuratWrappers’
The following objects are masked from ‘package:Seurat’:
DimPlot, DoHeatmap, LabelClusters, RidgePlot, VlnPlot
载入程辑包:‘cowplot’
The following object is masked from ‘package:ggpubr’:
get_legend
载入需要的程辑包:viridisLite
载入程辑包:‘reshape2’
The following object is masked from ‘package:tidyr’:
smiths
NOTE: Either Arial Narrow or Roboto Condensed fonts are required to use these themes.
Please use hrbrthemes::import_roboto_condensed() to install Roboto Condensed and
if Arial Narrow is not on your system, please see https://bit.ly/arialnarrow
Registered S3 method overwritten by 'enrichplot':
method from
fortify.enrichResult DOSE
clusterProfiler v3.14.3 For help: https://guangchuangyu.github.io/software/clusterProfiler
If you use clusterProfiler in published research, please cite:
Guangchuang Yu, Li-Gen Wang, Yanyan Han, Qing-Yu He. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A Journal of Integrative Biology. 2012, 16(5):284-287.
载入程辑包:‘clusterProfiler’
The following object is masked from ‘package:DelayedArray’:
simplify
Registering fonts with R
载入程辑包:‘plotly’
The following object is masked from ‘package:ggplot2’:
last_plot
The following object is masked from ‘package:IRanges’:
slice
The following object is masked from ‘package:S4Vectors’:
rename
The following object is masked from ‘package:stats’:
filter
The following object is masked from ‘package:graphics’:
layout
载入需要的程辑包:e1071
载入程辑包:‘widgetTools’
The following object is masked from ‘package:dplyr’:
funs
载入程辑包:‘DynDoc’
The following object is masked from ‘package:DelayedArray’:
path
The following object is masked from ‘package:BiocGenerics’:
path
载入程辑包:‘DT’
The following object is masked from ‘package:Seurat’:
JS
========================================
circlize version 0.4.13
CRAN page: https://cran.r-project.org/package=circlize
Github page: https://github.com/jokergoo/circlize
Documentation: https://jokergoo.github.io/circlize_book/book/
If you use it in published research, please cite:
Gu, Z. circlize implements and enhances circular visualization
in R. Bioinformatics 2014.
This message can be suppressed by:
suppressPackageStartupMessages(library(circlize))
========================================
载入需要的程辑包:grid
========================================
ComplexHeatmap version 2.2.0
Bioconductor page: http://bioconductor.org/packages/ComplexHeatmap/
Github page: https://github.com/jokergoo/ComplexHeatmap
Documentation: http://jokergoo.github.io/ComplexHeatmap-reference
If you use it in published research, please cite:
Gu, Z. Complex heatmaps reveal patterns and correlations in multidimensional
genomic data. Bioinformatics 2016.
========================================
载入程辑包:‘ComplexHeatmap’
The following object is masked from ‘package:plotly’:
add_heatmap
human_coronary_countmatrix <- read.csv("GSE131778_human_coronary_scRNAseq.txt", sep = "\t")
func <- function(s) {
paste0(strsplit(s, ".", fixed = T)[[1]][2], "_", strsplit(s, ".", fixed = T)[[1]][1])
}
colnames(human_coronary_countmatrix) <- lapply(colnames(human_coronary_countmatrix), func) # 拆分样本
human_coronary <- CreateSeuratObject(counts = human_coronary_countmatrix,
project = "human_coronary", min.cells = 10, min.features = 300) %>%
PercentageFeatureSet(pattern = "^MT-", col.name = "percent.mt") %>%
subset(subset = nFeature_RNA > 600 & nFeature_RNA < 6000 & nCount_RNA > 1000 & nCount_RNA < 30000) %>%
SCTransform(vars.to.regress = "percent.mt", verbose = F) %>%
RunPCA() %>% FindNeighbors(dims = 1:20) %>%
RunUMAP(dims = 1:20) %>%
FindClusters(resolution = 0.1)
rm(human_coronary_countmatrix)
# 批量读取计数矩阵
# 需要把行名的gene删掉,用vscode修改
count_mats <- list.files("./CA_GSE155512")
count_mats <- count_mats[count_mats != "sampleinfo.txt"]
allList <- lapply(count_mats, function(folder) {
CreateSeuratObject(
counts = read.csv(paste0("./CA_GSE155512/", folder), sep = "\t"),
project = folder, min.cells = 10, min.features = 300
)
})
# 合并seurat对象
CA_dataset1 <- merge(allList[[1]],
y = allList[-1], add.cell.ids = count_mats,
project = "CA_dataset1"
)
rm(allList)
CA_dataset1 <- PercentageFeatureSet(CA_dataset1, pattern = "^MT-", col.name = "percent.mt") %>%
subset(subset = nFeature_RNA > 600 & nFeature_RNA < 6000 & nCount_RNA > 1000 & nCount_RNA < 30000) %>%
SCTransform(vars.to.regress = "percent.mt", verbose = F) %>%
RunPCA() %>% FindNeighbors(dims = 1:20) %>%
RunUMAP(dims = 1:20) %>%
FindClusters(resolution = 0.1)
CA_dataset2 <- CreateSeuratObject(Read10X("./CA_GSE159677/"), names.field = 2, names.delim = "-",
project = "CA_dataset2", min.cells = 10, min.features = 300) %>%
PercentageFeatureSet(pattern = "^MT-", col.name = "percent.mt") %>%
subset(subset = nFeature_RNA > 600 & nFeature_RNA < 6000 & nCount_RNA > 1000 & nCount_RNA < 30000) %>%
SCTransform(vars.to.regress = "percent.mt", verbose = F) %>%
RunPCA() %>% FindNeighbors(dims = 1:20) %>%
RunUMAP(dims = 1:20) %>%
FindClusters(resolution = 0.1)
Idents(human_coronary) <- human_coronary$orig.ident
Idents(human_coronary) <- c("1","1","2","2","3","3","4","4")
human_coronary$samples <- Idents(human_coronary)
Idents(human_coronary) <- human_coronary$seurat_clusters
Idents(CA_dataset2) <- CA_dataset2$orig.ident
CA_dataset2 <- RenameIdents(CA_dataset2,'1' = 'AC_1','2' = 'PA_1','3' = 'AC_2','4' = 'PA_2','5' = 'AC_3','6' = 'PA_3')
UMAPPlot(CA_dataset2)
CA_dataset2$sample <- Idents(CA_dataset2)
CA_dataset2 <- RenameIdents(CA_dataset2,'AC_1' = 'AC','PA_1' = 'PA','AC_2'= 'AC','PA_2'= 'PA','AC_3'= 'AC','PA_3'= 'PA')
CA_dataset2$conditions <- Idents(CA_dataset2)
Idents(CA_dataset2) <- CA_dataset2$orig.ident
CA_dataset2 <- RenameIdents(CA_dataset2, '1' = 'sp_1','2' = 'sp_1','3' = 'sp_2','4' = 'sp_2','5' = 'sp_3','6' = 'sp_3')
CA_dataset2$groups <- Idents(CA_dataset2)
Idents(CA_dataset2) <- CA_dataset2$seurat_clusters
saveRDS(human_coronary,"human_coronary.rds")
saveRDS(CA_dataset1,"CA_dataset1.rds")
saveRDS(CA_dataset2,"CA_dataset2.rds") #已经经过分组处理了
Idents(CA_dataset2) <- CA_dataset2$conditions
AC <- subset(CA_dataset2, idents = "AC")
PA <- subset(CA_dataset2, idents = "PA")
AC <- AC %>% RunPCA() %>% FindNeighbors(dims = 1:20) %>%
RunUMAP(dims = 1:20) %>%
FindClusters(resolution = 0.1)
PA <- PA %>% RunPCA() %>% FindNeighbors(dims = 1:20) %>%
RunUMAP(dims = 1:20,seed.use = 20) %>%
FindClusters(resolution = 0.1)
AC <- AC %>% RunUMAP(dims = 1:20,seed.use = 20) %>%
FindClusters(resolution = 0.1)
SMC <- subset(AC,idents = c(1,4))
SMC <- SMC %>% FindNeighbors(dims = 1:20) %>% FindClusters(resolution = 0.2)
Computing nearest neighbor graph
Computing SNN
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 3057
Number of edges: 98364
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.9273
Number of communities: 5
Elapsed time: 0 seconds